A Latent Variable Recurrent Neural Network for Discourse-Driven Language Models
Authors
Abstract
This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a training objective that includes not only discourse relation classification, but also word prediction. As a result, it outperforms state-of-the-art alternatives for two tasks: implicit discourse relation classification in the Penn Discourse Treebank, and dialog act classification in the Switchboard corpus. Furthermore, by marginalizing over latent discourse relations at test time, we obtain a discourse-informed language model, which improves over a strong LSTM baseline.
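The distinction between predicting and marginalizing the latent discourse relation can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the prior over relations, and the per-relation sentence likelihoods below are all hypothetical toy values, standing in for quantities that a trained recurrent model would supply. Marginalizing sums the joint probability over relations to score the sentence (the language-modeling use), while predicting picks the relation with the highest posterior (the classification use).

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def marginal_log_likelihood(log_prior, log_lik):
    """log p(y) = log sum_z p(z) * p(y | z): the relation z is summed out."""
    return log_sum_exp([lp + ll for lp, ll in zip(log_prior, log_lik)])

def relation_posterior(log_prior, log_lik):
    """p(z | y) via Bayes' rule: the relation z is what gets predicted."""
    joint = [lp + ll for lp, ll in zip(log_prior, log_lik)]
    norm = log_sum_exp(joint)
    return [math.exp(j - norm) for j in joint]

# Hypothetical toy inputs (assumptions, not values from the paper):
# three candidate relations, a prior p(z), and the likelihood p(y | z)
# that a word-level model would assign to the next sentence under each z.
relations = ["Expansion", "Contingency", "Comparison"]
log_prior = [math.log(p) for p in (0.5, 0.3, 0.2)]
log_lik = [math.log(p) for p in (0.010, 0.040, 0.005)]

sentence_prob = math.exp(marginal_log_likelihood(log_prior, log_lik))
posterior = relation_posterior(log_prior, log_lik)
predicted = relations[posterior.index(max(posterior))]
```

With these toy numbers the marginal sentence probability is 0.5 * 0.010 + 0.3 * 0.040 + 0.2 * 0.005 = 0.018, and the most probable relation is "Contingency" even though its prior is not the largest, because the sentence likelihood under that relation dominates.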
Similar resources
A Latent Variable Recurrent Neural Network for Discourse Relation Language Models
This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations that link adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which ...
Latent words recurrent neural network language models
This paper proposes a novel language modeling approach called the latent word recurrent neural network language model, which solves the problems present in both recurrent neural network language models (RNNLMs) and latent word language models (LWLMs). The proposed model has a soft class structure based on a latent variable space, as in LWLMs, where the latent variable space is modeled using RNNL...
A Recurrent Latent Variable Model for Sequential Data
In this paper, we explore the inclusion of latent random variables into the hidden state of a recurrent neural network (RNN) by combining the elements of the variational autoencoder. We argue that through the use of high-level latent random variables, the variational RNN (VRNN) can model the kind of variability observed in highly structured sequential data such as natural speech. We empiricall...
Text Generation Based on Generative Adversarial Nets with Latent Variable
In this paper, we propose a model using a generative adversarial net (GAN) to generate realistic text. Instead of using a standard GAN, we combine a variational autoencoder (VAE) with a generative adversarial net. The use of high-level latent random variables helps to learn the data distribution and to solve the problem that a generative adversarial net always emits similar data. We propose the VGA...
Z-Forcing: Training Stochastic Recurrent Networks
Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequ...